Portfolio Choices with Orthogonal Bandit Learning
Abstract
The investigation and development of new methods from diverse perspectives to shed light on portfolio choice problems has never stagnated in financial research. Recently, multi-armed bandits have drawn intensive attention in various machine learning applications in online settings. The tradeoff between exploration and exploitation to maximize rewards in bandit algorithms naturally establishes a connection to portfolio choice problems. In this paper, we present a bandit algorithm for conducting online portfolio choices by effectively exploiting correlations among multiple arms. By constructing orthogonal portfolios from multiple assets and integrating them with the upper confidence bound bandit framework, we derive the optimal portfolio strategy, which represents a combination of passive and active investments according to a risk-adjusted reward function. Compared with widely cited trading strategies from the finance and machine learning fields on representative real-world market datasets, the proposed algorithm demonstrates superiority in both risk-adjusted return and cumulative wealth.
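The two ingredients named in the abstract, orthogonal portfolios and an upper confidence bound on a risk-adjusted reward, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the orthogonal portfolios are the eigenvectors of the sample covariance matrix, uses a Sharpe-like ratio as the reward, and applies the standard UCB1 exploration bonus. The function names, the sign-flipping convention, and the simulated returns are all hypothetical.

```python
import numpy as np


def orthogonal_portfolios(returns):
    """Eigen-decompose the sample covariance of asset returns.

    Columns of the returned weight matrix are mutually uncorrelated
    portfolios, sorted by decreasing variance (assumption: PCA-style
    construction, not necessarily the paper's exact procedure).
    """
    cov = np.cov(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order], np.clip(eigvals[order], 0.0, None)


def ucb_scores(mean_reward, std_reward, counts, t):
    """Sharpe-like ratio plus the standard UCB1 exploration bonus."""
    sharpe = mean_reward / (std_reward + 1e-12)
    bonus = np.sqrt(2.0 * np.log(t) / counts)
    return sharpe + bonus


def select_arm(returns, counts, t):
    """Pick the orthogonal portfolio with the highest UCB score."""
    weights, variances = orthogonal_portfolios(returns)
    means = returns.mean(axis=0) @ weights
    # Eigenvector signs are arbitrary; flip so each arm has a nonnegative
    # sample mean (hypothetical convention that allows short positions).
    signs = np.where(means < 0, -1.0, 1.0)
    weights, means = weights * signs, means * signs
    scores = ucb_scores(means, np.sqrt(variances), counts, t)
    k = int(np.argmax(scores))
    return k, weights[:, k]


# Simulated daily returns for 4 assets (illustrative data only).
rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 4))
counts = np.ones(4)  # each arm pulled once so far
arm, w = select_arm(returns, counts, t=5)
```

Because the arms are uncorrelated by construction, each arm's reward can be estimated separately, which is what allows a per-arm confidence bound to be applied here.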
Similar resources
Risk-aware multi-armed bandit problem with application to portfolio selection
Sequential portfolio selection has attracted increasing interest in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem addresses the primary difficulty in sequential decision-making under uncertainty, namely the exploration versus exploitation dilemma, and therefore...
Cooperation control in Parallel SAT Solving: a Multi-armed Bandit Approach
In recent years, Parallel SAT solvers have leveraged the so-called Parallel Portfolio architecture. In this setting, a collection of independent Conflict-Directed Clause Learning (CDCL) algorithms compete and cooperate through Clause Sharing. However, when the number of cores increases, systematic clause sharing between CDCLs can slow down the search performance. Previous work has shown how...
Algorithm selection of reinforcement learning algorithms
Dialogue systems rely on a careful reinforcement learning (RL) design: the learning algorithm and its state space representation. Lacking more rigorous knowledge, the designer resorts to practical experience to choose the best option. In order to automate and improve the performance of this process, this article tackles the problem of online RL algorithm selection. A met...
Multi-Armed Bandits for Addressing the Exploration/Exploitation Trade-off in Self Improving Learning Environment
This project proposes the use of machine learning techniques such as Multi-Armed Bandits to implement self-improving learning environments. The goal of a self-improving learning environment is to make good pedagogical choices while measuring the efficiency of these choices. The modeling of students is done using the LFA model and fitted on a dataset of university courses to allow to simulate...
Automatically Reinforcing a Game AI
A recent research trend in Artificial Intelligence (AI) is the combination of several programs into one single, stronger, program; this is termed portfolio methods. We here investigate the application of such methods to Game Playing Programs (GPPs). In addition, we consider the case in which only one GPP is available by decomposing this single GPP into several ones through the use of parameters...